1. Introduction to Plotly

Interactive Plotly charts are much more fun to play with than static ones. You can zoom in, hover over points for details, and explore data without generating a dozen static images.

Let’s first load the diamonds dataset.

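# Load ggplot2 (which ships the diamonds dataset) and plotly (for ggplotly)
library(ggplot2)
library(plotly)
data("diamonds")
head(diamonds)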
## # A tibble: 6 × 10
##   carat cut       color clarity depth table price     x     y     z
##   <dbl> <ord>     <ord> <ord>   <dbl> <dbl> <int> <dbl> <dbl> <dbl>
## 1  0.23 Ideal     E     SI2      61.5    55   326  3.95  3.98  2.43
## 2  0.21 Premium   E     SI1      59.8    61   326  3.89  3.84  2.31
## 3  0.23 Good      E     VS1      56.9    65   327  4.05  4.07  2.31
## 4  0.29 Premium   I     VS2      62.4    58   334  4.2   4.23  2.63
## 5  0.31 Good      J     SI2      63.3    58   335  4.34  4.35  2.75
## 6  0.24 Very Good J     VVS2     62.8    57   336  3.94  3.96  2.48

2. What Kind of Plot Should I Use?

Bar Plot: Great for categorical data.
Scatter Plot: Perfect for two numerical variables.
Histogram: Ideal for showing the distribution of one numeric variable.
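
The rule of thumb above can be written down as a small helper. This is only a sketch: `suggest_plot()` is a hypothetical function made up for illustration, not part of ggplot2 or plotly, and it covers only the three plot types listed here.

```r
library(ggplot2)  # provides the diamonds dataset

# Hypothetical helper: suggest a plot type from the kind of column(s) involved.
suggest_plot <- function(x, y = NULL) {
  if (is.factor(x) || is.character(x)) {
    return("bar plot")        # categorical data
  }
  if (is.numeric(x) && is.numeric(y)) {
    return("scatter plot")    # two numeric variables
  }
  if (is.numeric(x) && is.null(y)) {
    return("histogram")       # one numeric variable
  }
  "no suggestion"             # anything this sketch does not cover
}

suggest_plot(diamonds$cut)
suggest_plot(diamonds$carat, diamonds$price)
suggest_plot(diamonds$price)
```

Real datasets often call for other choices (box plots, line charts, and so on), so treat the helper as a memory aid rather than a rule.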

Practice 1: Bar Plot (Categorical Data)

Let’s start simple. A bar plot is great for comparing categories. Create a ggplot bar chart comparing the counts of diamonds by their cut quality.

# Create a ggplot bar chart
bar_plot <- ggplot(diamonds, aes(x = cut)) + 
  geom_bar() + 
  ggtitle("Diamonds by Cut Quality") +
  theme_minimal()

bar_plot

Question: What do you notice about the distribution of cuts? ‘Ideal’ is by far the most common cut in the dataset!

Now, let’s add some bling by converting this into a plotly plot.

# Convert to interactive plot
ggplotly(bar_plot)

Look! You can now hover over the bars and see exactly how many diamonds there are in each category. Much shinier!
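
The counts shown on hover can be cross-checked without a plot. A minimal sketch using base R's `table()`, assuming ggplot2 is installed so that `diamonds` is available:

```r
library(ggplot2)  # provides the diamonds dataset

# Count diamonds per cut; these are the bar heights geom_bar() draws.
cut_counts <- table(diamonds$cut)
cut_counts

# Name of the most common cut
names(cut_counts)[which.max(cut_counts)]
```

The numbers printed here should match what the tooltip reports for each bar.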

Practice 2: Scatter Plot (Two Numeric Variables)

Scatter plots are perfect for showing relationships between two numerical variables. Let’s check out the relationship between carat (weight) and price. Is bigger always better?

# Create a ggplot scatter plot
scatter_plot <- ggplot(diamonds, aes(x = carat, y = price)) + 
  geom_point(alpha = 0.5, color = "blue") + 
  ggtitle("Carat vs Price") +
  theme_minimal()

scatter_plot

Now, convert this to a Plotly plot so you can zoom in and get up close with those high-priced diamonds!

# Convert to interactive plot
ggplotly(scatter_plot)

Question: What can you infer from this scatter plot? Do bigger diamonds (higher carat) seem to cost more? Also notice how steeply prices jump for certain diamonds.
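
One way to put a number on that question is a correlation coefficient. The sketch below uses base R's `cor()`. A single number cannot show curvature or clusters, so it complements the plot rather than replacing it.

```r
library(ggplot2)  # provides the diamonds dataset

# Pearson correlation between carat and price (ranges from -1 to 1)
carat_price_cor <- cor(diamonds$carat, diamonds$price)
carat_price_cor

# Spearman (rank-based) correlation, which does not assume a straight-line shape
carat_price_rank_cor <- cor(diamonds$carat, diamonds$price, method = "spearman")
carat_price_rank_cor
```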

Practice 3: Histogram (Distribution of Numeric Data)

Histograms help you see the distribution of a single numeric variable. Let’s check out the distribution of diamond prices. Ready to be shocked?

# Create a ggplot histogram
histogram <- ggplot(diamonds, aes(x = price)) + 
  geom_histogram(binwidth = 1000, fill = "green", color = "black") +
  ggtitle("Distribution of Diamond Prices") +
  theme_minimal()

histogram

Make it interactive to explore those outlier prices!

# Convert to interactive plot
ggplotly(histogram)

Challenge: Try adjusting the binwidth in the histogram and see how it changes the shape of the distribution. What happens when you make the binwidth smaller or larger?
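
As a starting point for the challenge, here is a sketch that builds the same histogram at several binwidths. The helper name `price_histogram()` and the three binwidth values are arbitrary choices for illustration.

```r
library(ggplot2)
library(plotly)

# Hypothetical helper: build the price histogram with a given binwidth (in dollars)
price_histogram <- function(bw) {
  ggplot(diamonds, aes(x = price)) +
    geom_histogram(binwidth = bw, fill = "green", color = "black") +
    ggtitle(paste("Diamond Prices, binwidth =", bw)) +
    theme_minimal()
}

# Illustrative binwidths: narrow, the original, and wide
binwidths <- c(250, 1000, 5000)
histograms <- lapply(binwidths, price_histogram)

# View any of them interactively, for example the narrowest
ggplotly(histograms[[1]])
```

Swap the index in `histograms[[1]]` to compare how much detail each binwidth reveals or hides.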

Bonus Practice: Add More Bling (Customization)

The cool thing about Plotly is that you can keep customizing. Let’s spice up our scatter plot by adding color based on the diamond’s clarity.

# Create a customized ggplot scatter plot
scatter_plot_colored <- ggplot(diamonds, aes(x = carat, y = price, color = clarity)) + 
  geom_point(alpha = 0.5) + 
  ggtitle("Carat vs Price (Colored by Clarity)") +
  theme_minimal()

# Convert to interactive plot
ggplotly(scatter_plot_colored)
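
The hover text itself can be customized too. The sketch below relies on the unofficial `text` aesthetic that `ggplotly()` reads for hover labels, and it plots a random subset so the interactive plot stays responsive. The sample size of 2000 and the seed are arbitrary assumptions.

```r
library(ggplot2)
library(plotly)

# Assumption: a random subset of 2000 rows is enough to see the pattern
set.seed(42)
diamonds_sample <- diamonds[sample(nrow(diamonds), 2000), ]

# `text` is an unofficial aesthetic that ggplotly() can use as the hover label;
# "<br>" starts a new line inside the tooltip
hover_plot <- ggplot(
  diamonds_sample,
  aes(x = carat, y = price, color = clarity,
      text = paste0("Clarity: ", clarity, "<br>Cut: ", cut))
) +
  geom_point(alpha = 0.5) +
  ggtitle("Carat vs Price (Colored by Clarity)") +
  theme_minimal()

# Show only the custom text on hover
ggplotly(hover_plot, tooltip = "text")
```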

Now you can see the relationship between price, carat, and clarity interactively! Hover over the points to discover the clarity of those shiny diamonds.

7. Faceting: Viewing Data Across Categories

Faceting is an effective way to create separate panels within a single plot for different categories. For instance, if you want to examine how diamond prices vary across different levels of clarity, faceting can help.

Let’s create a faceted scatter plot showing the relationship between carat and price, separated by diamond cut:

# Create a ggplot scatter plot with facets
facet_plot <- ggplot(diamonds, aes(x = carat, y = price)) + 
  geom_point(alpha = 0.3, color = "purple") + 
  facet_wrap(~ cut) + 
  ggtitle("Carat vs Price Faceted by Cut") +
  theme_minimal()

facet_plot

Now, let’s make this faceted plot interactive:

# Convert to interactive plot
ggplotly(facet_plot)
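
`facet_wrap()` chooses a grid layout for you. If you would rather line the panels up side by side so prices can be compared along one shared y-axis, a sketch like the following sets the number of columns explicitly. The value `ncol = 5` assumes the five cut levels in `diamonds`.

```r
library(ggplot2)
library(plotly)

# Same faceted plot, with all five cut panels laid out in a single row
facet_row_plot <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(alpha = 0.3, color = "purple") +
  facet_wrap(~ cut, ncol = 5) +
  ggtitle("Carat vs Price Faceted by Cut (Single Row)") +
  theme_minimal()

ggplotly(facet_row_plot)
```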

8. Adding Trend Lines: Understanding Patterns

Adding trend lines to scatter plots can help reveal underlying patterns, such as the general increase in diamond price as carat increases. Let’s add a trend line to see the overall relationship between carat and price.

# Create a scatter plot with a trend line
trend_plot <- ggplot(diamonds, aes(x = carat, y = price)) + 
  geom_point(alpha = 0.3, color = "blue") + 
  geom_smooth(method = "lm", color = "red", se = FALSE) + 
  ggtitle("Carat vs Price with Trend Line") +
  theme_minimal()

trend_plot
## `geom_smooth()` using formula = 'y ~ x'

Convert this to an interactive plot:

# Convert to interactive plot
ggplotly(trend_plot)
## `geom_smooth()` using formula = 'y ~ x'
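
The red line comes from a straight-line fit of price on carat (the `y ~ x` formula in the message above). To see the numbers behind it, the same model can be fitted directly with base R's `lm()`. This is a sketch for inspection, not something the plot requires.

```r
library(ggplot2)  # provides the diamonds dataset

# Fit the same straight-line model that geom_smooth(method = "lm") draws
price_fit <- lm(price ~ carat, data = diamonds)

# Intercept and slope; the slope is the estimated price change per extra carat
coef(price_fit)
```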

9. Wrapping Up

Plotly is a fantastic way to make your visualizations more interactive and fun. You can zoom, hover, and explore your data in ways that static plots simply don’t allow. Plus, converting your ggplot plots to plotly is as easy as wrapping them in ggplotly()!

In this tutorial, you learned how to: